Publication number: 0 163 406 A1




EUROPEAN PATENT APPLICATION 



Application number: 85302744.9
Date of filing: 18.04.85

Int. Cl.: C 12 N 15/00, C 12 P 19/34, C 12 N 1/20, C 12 P 21/02, C 07 K 7/40

Priority: 01.05.84 CA 453270

Date of publication of application: 04.12.85 Bulletin 85/49

Designated Contracting States: BE CH DE FR GB IT LI NL SE



Applicant: Canadian Patents and Development Limited
Société Canadienne des Brevets et d'Exploitation Limitée
275 Slater Street
Ottawa Ontario, K1A 0R3 (CA)

Inventor: Garvin, Robert T.
33 Deforest
Toronto Ontario M6S Ul (CA)

Inventor: Shen, Shi-Hsiang
15 Rockford Drive Apt. 1105
Willowdale Ontario M2R 3A3 (CA)

Inventor: James, Eric
3303 Don Mills Road Apt. 2903
Willowdale Ontario M2J 3A3 (CA)

Representative: Burford, Anthony Frederick et al,
W.H. Beck, Greener & Co. 7 Stone Buildings Lincoln's Inn
London WC2A 3SZ (GB)




Novel polypeptide expression method.

The stability of polypeptides, such as proinsulin,
somatostatin and interferon, produced in microorganisms
transformed by cloning vehicles containing a DNA sequence
coding for the polypeptide, such that the polypeptide may be
recovered as a product of the microorganism, is significantly
improved by inserting into the cloning vehicle a DNA
fragment which codes for tandemly-linked multiple copies of
the polypeptide gene joined by linking sequences which
code for easily-cleavable amino acid sequences. Stop and
start codons are absent between the ends of the fragment, so
that the whole fragment is expressed. High yields of the
multiple copy (polymeric) product are obtained and may be
cleaved at the contrived joins to provide single polypeptides.
Novel plasmids for the specific example of human proinsulin
are described.

The present invention relates to the expression
of polypeptides from microorganisms.

Bacteria, in common with mammalian and other
eucaryotic cells, possess a proteolytic system for the
selective degradation of abnormal or no longer needed
polypeptides. In the natural course of events,
abnormalities may arise by either mutation or
biosynthetic error. Alternatively, a normal
polypeptide may be needed only for a short while, or
its continued presence may be deleterious to the cell,
so that, once the polypeptide has carried out its
function, it is selectively destroyed. These
operations imply a complex network of interacting
regulatory and structural elements. In Escherichia coli,
the best studied bacterial system, eight different
endoproteases have so far been described which
function, individually or in concert, in the
selective degradation system.

The introduction of foreign genes into a
microorganism often has inadvertently activated this
proteolytic system, thereby inhibiting the large scale
production of a desired gene product by the practice
of genetic engineering. This problem is particularly
acute when genes coding for eucaryotic peptides are
introduced into the bacteria. The strategy for
circumventing this problem has been to fuse the
nucleotide sequence coding for the desired gene
product to at least part of a structural or
carrier protein gene, whose product is known to be
constitutively stable.

The tandem arrangement of desired gene product
and carrier protein results in the expression from the
organism of a hybrid polypeptide which is stable,
provided that the carrier protein is large enough to 
identify the entire hybrid polypeptide as both native 
and constitutively stable. The prior art difficulty 
of direct expression of eucaryotic polypeptides and 
the necessity to use a carrier protein to achieve 
expression is well recognized and has been described 
in a number of scientific papers and published patent 
applications and patents, for example, in U.S. Patent 
No. 4,342,832. A typical patent which describes the 
use of this technique, for the production of 
somatostatin, is U.S. Patent No. 4,366,246. 

A disadvantage to this approach, however, is that 
the desired product constitutes only a small portion 
of the hybrid polypeptide, thereby decreasing 
significantly the theoretical yield, and necessitating 
further processing to split off the native carrier 
protein and to purify the desired polypeptide product. 
Efforts to decrease the native carrier moiety to a 
small portion of the hybrid product in an attempt to 
improve the yield, however, render the expressed 
hybrid polypeptide unstable. 
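
The yield penalty of the carrier-protein approach can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the residue counts (a 14-residue product on carriers of 1,000 and 100 residues) are assumptions chosen for the example, not figures taken from this specification, and residue count is used as a rough proxy for mass.

```python
def product_fraction(product_residues: int, carrier_residues: int, copies: int = 1) -> float:
    """Fraction of a hybrid polypeptide, by residue count, that is desired product."""
    if product_residues <= 0 or carrier_residues < 0 or copies < 1:
        raise ValueError("lengths must be positive and copies must be at least 1")
    product_total = product_residues * copies
    return product_total / (product_total + carrier_residues)


# Hypothetical illustration: a short product fused to a long carrier.
on_long_carrier = product_fraction(product_residues=14, carrier_residues=1000)
# Shrinking the carrier raises the fraction, but (as noted above) at the
# cost of the stability that the carrier was meant to provide.
on_short_carrier = product_fraction(product_residues=14, carrier_residues=100)
print(f"{on_long_carrier:.3f} vs {on_short_carrier:.3f}")
```

The same function accepts a `copies` argument, so it can also be used to see how the product fraction changes when several product units share one carrier.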

The symbols, abbreviations and some of the terms 
used herein have the following meanings: 



DNA - Deoxyribonucleic acid
A - Adenine
T - Thymine
G - Guanine
C - Cytosine
Tris - 2-Amino-2-(hydroxymethyl)-1,3-propanediol
EDTA - Ethylenediaminetetraacetic acid
ATP - Adenosine triphosphate
TTP - Thymidine triphosphate
GTP - Guanosine triphosphate
T4 - Bacteriophage specific to E. coli
Nucleotide - Nucleoside phosphate
Codon - Sequence of three nucleotides specifying a particular amino acid
Oligonucleotide - Polymer consisting of a number of nucleotides
Plasmid - Self-replicating, extrachromosomal genetic element

In accordance with the present invention, it has 
now surprisingly been found that this prior art 
problem can be satisfactorily overcome and the 
natural, selective proteolytic mechanisms of the 
microorganisms can be circumvented to achieve high 
yield production of a desired eucaryotic polypeptide. 
This is achieved by tandemly-linking a plurality of
polypeptide genes through easily-cleavable amino acid
sequences, incorporating the resulting DNA construct
into a cloning vehicle, and transforming a 
microorganism with the cloning vehicle. 

In accordance with the present invention, 
therefore, there is provided a novel DNA fragment 
having an oligonucleotide sequence coding for a 
polypeptide composed of repeating units of a desired 
product of known amino acid sequence separated by an 
oligonucleotide coding for an easily-cleavable amino 
acid sequence, wherein stop and start codons are 
absent between the ends of the fragment. 

By providing a DNA fragment composed of repeating
units coding for a desired product of known amino acid
sequence and separated by oligonucleotides coding for 
easily-cleavable amino acid units, a replicable 
cloning vehicle can be provided which is capable of 
stably expressing a product in a microbial organism. 
Once the product has been expressed during growth of 
the microorganism, it may be isolated from the cell 
culture, and the desired polypeptide product is 
obtained from the isolated product by cleaving the 
repeating units at the cleaving sites to form a 
plurality of individual units of the desired product. 
Since multiple copies of the desired polypeptide
product are expressed from the microorganism, the 
yield of product is considerably enhanced with respect 
to the prior art expression procedure discussed above. 

The present invention has general applicability 
to the enhancement of stability of polypeptides in 
microorganisms and may be used to produce a wide 
variety of polypeptide products, comprising hormone 
polypeptides, including human and animal source 
polypeptides, for example, human proinsulin, human 
growth hormone, somatostatin and human interferon, and 
viral proteins for use in vaccines. 

Essential to the present invention is the
formation of a DNA fragment which contains at least
two units each coding for a desired product of known
amino acid sequence, separated by a joining unit
whose product is readily cleaved. The number of
repeating units which is required to achieve stability
depends on the desired product, while the nature of
the joining unit and the cleaving agent used for
cleaving also depend on the desired product.

The term "stability" as used herein with respect 
to the expression products refers to a material that 
has a half-life of sufficient duration to enable 
extraction thereof to be effected. It is preferred to 
provide by this invention products having a half-life 
greater than about 200 minutes. 
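
To give a feel for what a half-life of about 200 minutes means in practice, the following sketch assumes simple first-order (exponential) decay of the expressed product. That kinetic model and the function name are assumptions made for illustration; the specification itself states only the preferred half-life.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of product left after elapsed_min, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


# With a 200-minute half-life, half of the product survives 200 minutes
# and a quarter survives 400 minutes.
for minutes in (60, 200, 400):
    print(minutes, round(fraction_remaining(minutes, 200.0), 3))
```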

The specific embodiment of the invention relates 
to the expression of human proinsulin, i.e. the 
precursor of insulin containing the C-chain in 
addition to the A and B chains, but the principles 
thereof are widely applicable to any polypeptide, as
discussed above.

In the following description, reference is made
to the accompanying drawings, in which:

Figure 1 is a schematic illustration of the
formation of recombinant plasmids in accordance with
this invention;

Figure 2 is a derivation and restriction site map
of plasmid plac 239-3PI;

Figure 3 is a derivation and restriction site map
of plasmid pTac/5PI;

Figure 4 illustrates the sequence of steps to
form plasmid pAT/PIR;

Figure 5 is a derivation and restriction site map
of plasmid pAT/PIR;

Figure 6 illustrates the sequence of steps to
form plasmid plac 504/PI, and formation of plac 239-PI
and pTac/PI therefrom;

Figure 7 is a derivation and restriction site map
of plasmid plac 504/PI;

Figure 8 illustrates formation of plasmid plac
239-PI from plac 504/PI; and

Figures 9, 10 and 11 are photographs of
electrophoresis experiments conducted on various
products produced in accordance with this invention.

In the utilization of the present invention to
effect the production of human proinsulin in
accordance with the preferred embodiment hereof, a DNA
fragment is formed containing a plurality of sequences
coding for human proinsulin separated by
oligonucleotide sequences coding for easily-cleavable
units, which, in the final product, permit ready
separation of the individual proinsulin peptides one
from another. Sequences coding for stop and start
signals are absent from the fragment between its ends.
The DNA fragment is inserted into a replicable cloning
vehicle which is capable of expressing the fragment in
a microbial organism.

The preferred method of constructing the multiple
proinsulin gene and inserting the same into a cloning
vehicle, in both a fused and an unfused configuration,
is illustrated in Figure 1. In the fused
configuration, the human proinsulin coding sequence is
joined to the 3'-side of a fragment containing the lac
promoter and encoding the first 80 amino acids of the
N-terminus of β-galactosidase. In the unfused
configuration, the proinsulin coding sequence is
linked directly to a fragment containing the Tac
promoter, followed by a bacterial Shine-Dalgarno
sequence. It has been found that the yield of
expressed polypeptide, in terms of grams per cell of
microorganism, is significantly greater in the case of
the fused configuration, as compared with the unfused
configuration.

In both configurations, the single gene human
proinsulin is too unstable to be isolated as a product
from the microorganism but, as additional human
proinsulin coding sequences are added to the genetic
construct by the method of Figure 1, the resultant
multiple domain polypeptide becomes increasingly
stable, approaching normal stability at three
tandemly-linked proinsulin domains in the unfused
configuration, and at two tandemly-linked proinsulin
domains in the fused configuration. As increasing
numbers of proinsulin coding sequences are tandemly
linked, a maximum number of coding sequences is
reached, beyond which the yield of polymer declines.
For example, increasing the number of tandemly-linked
units in the unfused configuration for proinsulin
beyond six results in a decreased amount of polymeric
product being produced by the cell. It is preferred
to utilize 4 to 5 proinsulin coding sequences in the
unfused configuration and 3 to 4 proinsulin coding
sequences in the fused configuration.

The oligonucleotide sequence for the linking unit 
or domain junction in the embodiment illustrated in 
Figure 1 is represented as follows: 

Proinsulin gene - Arg Arg Asn Ser Met - Proinsulin gene
                  AGG CGC AAT TCT ATG

The amino acid domain junction is easily cleavable, by 
reaction with cyanogen bromide, into single proinsulin 
peptides. This domain junction and its manner of 
cleaving are but one example of a variety of linking 
units and modes of cleaving which may be adopted. 
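
The correspondence between the junction codons and the amino acids they encode can be checked mechanically. The sketch below is a minimal illustration: the lookup table covers only the five codons that occur in the junction (their assignments follow the standard genetic code), and the names `CODON_TO_RESIDUE` and `translate` are hypothetical helpers introduced here.

```python
# Only the codons that occur in the domain junction are tabulated.
CODON_TO_RESIDUE = {
    "AGG": "Arg",
    "CGC": "Arg",
    "AAT": "Asn",
    "TCT": "Ser",
    "ATG": "Met",
}


def translate(dna: str) -> list[str]:
    """Translate a coding-strand sequence, read 5' to 3', codon by codon."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [CODON_TO_RESIDUE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


junction = translate("AGG CGC AAT TCT ATG")
print("-".join(junction))  # Arg-Arg-Asn-Ser-Met
```

The terminal Met is the residue at which cyanogen bromide acts, which is why the junction ends with ATG immediately before the next proinsulin coding sequence.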

For those polypeptides which contain no 
methionine group, cyanogen bromide cleavage may be 
used to produce the native material directly, provided 
that a methionine codon constitutes the peptide coding 
junction, while for those polypeptides which do 
contain the methionine group, the design of a 
particular amino acid sequence positioned between 
desired polypeptide segments to provide an 
easily-cleavable site may be effected by genetic 
engineering methods. For example, polypeptides which 
do not contain an aspartyl-proline bond may be 
engineered to contain such a bond between each unit of 
the multi-coding sequence construct, so that recovery 
of the native material from the isolated multi-domain 
polypeptide may be achieved following cleavage by 
acids, as described by Jauregui-Adell and Marti, Anal.
Biochem. 69, 468 (1975).

In general, enzymatic methods or chemical 
reactions which are amino acid residue-sequence 
specific may be used to effect cleavage of linking 
groups between the desired polypeptide genes. 

The authentic human proinsulin molecule contains 
no methionine residues, so that the cyanogen bromide 
treatment cleaves a multi-domain proinsulin molecule 
into several proinsulin analog moieties along with the 
C-terminal authentic proinsulin unit. This results 
from the designed placement of methionine codons at 
the 5'-end of each of the proinsulin coding sequence
units.
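
The outcome of the cyanogen bromide treatment can be sketched as a string operation on a one-letter amino acid sequence. This is a simplified model under stated assumptions: `UNIT` is a short, methionine-free placeholder standing in for one proinsulin domain (it is not the proinsulin sequence), the leading `M` represents the methionine at the start of the first unit, and the chemical conversion of the cleaved methionine is not modelled.

```python
def cnbr_fragments(polypeptide: str) -> list[str]:
    """Split a one-letter sequence on the C-terminal side of every Met (M)."""
    fragments, current = [], ""
    for residue in polypeptide:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


UNIT = "FVNYCN"   # hypothetical methionine-free stand-in for one proinsulin domain
LINKER = "RRNSM"  # Arg-Arg-Asn-Ser-Met domain junction

trimer = "M" + UNIT + LINKER + UNIT + LINKER + UNIT
print(cnbr_fragments(trimer))
```

In this model every unit except the last keeps the junction residues at its C-terminus (the analog moieties), while the final fragment is the bare unit, mirroring the authentic C-terminal proinsulin described above.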

The individual proinsulin units derived in this
manner may be further processed into authentic human 
insulin by digestion with trypsin and carboxypeptidase 
B, or by any other convenient procedure. 

The formation of a recombinant plasmid containing 
an oligonucleotide sequence coding for multiple copies 
of human proinsulin leads to a considerable increase 
in yield of proinsulin and proinsulin analogs, when 
compared with the expression in the prior art of the 
hybrid molecule containing the β-galactosidase gene.

As seen in Figure 1, the preferred sequence to
form the novel plasmids pTac/2PI and plac 239/2PI, and
other multiple PI unit plasmids, especially the
expression plasmids pTac/5PI and plac 239/3PI, starts
from the plasmid pAT/PIR, which contains an
oligonucleotide sequence coding for human proinsulin
between the EcoRI and BamHI restriction sites. The
plasmid plac 239/3PI is further engineered, as
illustrated in Figure 8, to form the preferred
expression plasmid plac 239-3PI. The derivation of
the plasmid pAT/PIR, which is also a novel plasmid,
from the known plasmid pBCA4 is illustrated in
Figure 4. The pTac and plac 239 fragments are derived
from plasmids pTac/PI and plac 239/PI respectively,
which in turn are both derived from another novel
plasmid, plac 504/PI, as shown in Figure 6. The
derivation of plasmid plac 504/PI from known plasmids
is also shown in Figure 6.

It is not essential to obtain the pTac and plac
vectors from the plasmids pTac/PI and plac/PI, i.e.
plasmids containing a single copy of the proinsulin
gene. They may be derived from any other convenient
plasmid which contains the Tac or lac promoter
respectively, and which can be cleaved at EcoRI and
BamHI sites to form the vector for ligation with the
multiple proinsulin gene fragment. Similarly, it is
not essential to form the plasmid plac 504/PI, i.e.
also containing the single copy of the proinsulin
gene, as an intermediate, but any other convenient
intermediate plasmid may be employed. The particular
single copy proinsulin gene plasmids here embodied
were formed by the inventors during the
experimentation which led to this invention in an
attempt to provide constructs from which proinsulin
would be stably expressed. As noted earlier, this was
not found to be possible with the single copy gene.

Specific experimental details for the formation 
of the various plasmids set forth in the drawings and 
disclosed above are discussed in the Examples below. 

Detailed derivations and major restriction site 
maps for the novel plasmids plac 239-3PI, pTac/5PI, 
pAT/PIR and plac 504/PI appear as Figures 2, 3, 5 and 
7 respectively. E. coli JM103 modified by each of the
plasmids plac 239-3PI and pTac/5PI was
deposited on July 6, 1984 with the American Type
Culture Collection located at Rockville, Maryland,
U.S.A., and the deposits have been accorded the
following accession Nos.:

Organism E. coli/plasmid plac 239-3PI - ATCC 39760
Organism E. coli/plasmid pTac/5PI - ATCC 39759

The invention is illustrated further by the
following Examples.

Example 1

This Example describes the formation of novel 
plasmids containing multiple units of the human 
proinsulin gene. 
(a) Preparation of pAT/PIR 

The plasmid pAT/PIR was prepared following the
sequence outlined in Figure 4 and the restriction map
for the plasmid appears as Figure 5. The starting
plasmid pBCA4 contains a nucleotide sequence coding
for human proinsulin between its EcoRI and BamHI
restriction sites. However, if the EcoRI-BamHI
fragment of this plasmid is used directly, the wrong
reading frame results, and hence the fragment must be
modified as described below. Plasmid pBCA4 is
described in Narang et al, Gene, 17, 279 (1982) and
was obtained from Professor R. Wu of Cornell
University, Ithaca, N.Y., U.S.A., an author of that
paper.

The plasmid pBCA4 was cut with FokI and BamHI and
the FokI-BamHI fragment was isolated. Two synthetic
oligonucleotides were created and ligated with the
FokI-BamHI fragment to extend the 5'-end of the PI
sequence from FokI to EcoRI, the altered EcoRI-ended
PI sequences being designated PIR. The synthetic
nucleotide sequences were generated chemically and
comprised the sequences:

5' AATTCTATGTTTGTCAAT     3'
3'     GATACAAACAGTTAGTCG 5'
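
As a consistency check on the two synthetic strands, the sketch below confirms that they pair over their central region and leave a single-stranded end on each side. The four-base overhang length is an assumption inferred from the strand lengths rather than a value stated in the text, and `complement` is a helper introduced here for illustration.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Base-by-base Watson-Crick complement, without reversing the strand."""
    return "".join(PAIR[base] for base in strand)


TOP = "AATTCTATGTTTGTCAAT"     # written 5' -> 3'
BOTTOM = "GATACAAACAGTTAGTCG"  # written 3' -> 5'
OVERHANG = 4                   # assumed length of each single-stranded end

paired_top = TOP[OVERHANG:]                        # region of the top strand that pairs
paired_bottom = BOTTOM[:len(BOTTOM) - OVERHANG]    # region of the bottom strand that pairs
duplex_ok = complement(paired_top) == paired_bottom

print(duplex_ok)           # whether the central regions are complementary
print(TOP[:OVERHANG])      # unpaired 5' end of the top strand (EcoRI-compatible)
print(BOTTOM[-OVERHANG:])  # unpaired end of the bottom strand, written 3' -> 5'
```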

The starting plasmid pAT153 is a high copy
variant of the well-known plasmid pBR322 with the
so-called "poison" sequences removed and is described
by Twigg et al, Nature, 283, 216 (1980). The plasmid
pAT153 was obtained from Professor O. Smithies, Dept.
of Genetics, University of Wisconsin - Madison,
Madison, Wisconsin, U.S.A. The plasmid pAT153 was cut
with EcoRI and BamHI and the vector isolated. The
isolated vector was ligated to the EcoRI-BamHI (PIR)
fragment, to form the new plasmid pAT/PIR.
(b) Preparation of Multiple Copies of Proinsulin Gene

The sequence of operations required to form
multiple copies of the proinsulin gene is outlined in
Figure 1. Synthetic oligonucleotides (A) and (B) were
prepared chemically to provide the following
sequences:

(A) 5' CCTCTACCAGCTGGAGAACTACTGCAACAGGCGC     3'
(B) 3'     ATGGTCGACCTCTTGATGACGTTGTCCGCGTTAA 5'

200 pmoles of oligonucleotide (B) and 200 pmoles of
oligonucleotide (A), having its 5'-end labelled with
32P-ATP, were mixed in 30 µl of 66 mM Tris-HCl (pH
7.6), 6.6 mM MgCl2 and 10 mM dithiothreitol, and
heated at 80°C for 15 minutes to anneal the
oligonucleotides to form a part of the human
proinsulin coding sequence beginning at the SfaNI cut
site near the 3'-end of the coding sequence and
terminating with the last proinsulin codon (AAC),
followed by the additional sequences:

5' AGG CGC      3'
3' TCC GCG TTAA 5'

These additional sequences of the synthetic
oligonucleotides encode two arginine residues and
provide a sticky end for ligation to an EcoRI sticky
end. The ligation destroys the EcoRI site because of
the CG base pair (boxed in Figure 1).
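
The loss of the EcoRI site on ligation can be verified by inspecting the sequence across the join. The sketch below assumes the standard EcoRI recognition sequence GAATTC and uses short excerpts of the top strands on either side of the join; the variable names are illustrative.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence, read 5' -> 3'

OLIGO_A_END = "AACAGGCGC"    # last proinsulin codon (AAC) plus AGG CGC
ECORI_CUT_END = "AATTCTATG"  # top strand of the EcoRI-cut end of the PIR fragment

# Top strand across the join after ligation of the two sticky ends.
joined = OLIGO_A_END + ECORI_CUT_END
# For comparison: the same end preceded by G, as in an intact EcoRI site.
intact = "G" + ECORI_CUT_END

print(ECORI_SITE in joined)  # False: the base before AATTC is C, not G
print(ECORI_SITE in intact)  # True
```

Because the join reads CAATTC rather than GAATTC, EcoRI present in a later ligation mixture would not be expected to cut at these internal junctions.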

10 µg of pAT/PIR, prepared as described in (a)
above, was digested for 1 hour at 37°C with 40 units
each of EcoRI and BamHI in 40 µl of digestion buffer.
The digested DNA was dephosphorylated by digestion at
37°C for 30 minutes with 4 units of calf intestine
alkaline phosphatase in 200 mM Tris-HCl, pH 8.0. The
proinsulin gene fragment (PIR) was isolated from 1%
agarose gel by phenol extraction. The isolated
fragment was further digested for 3 hours at 37°C with
6 units of SfaNI in 40 µl of digestion buffer and the
EcoRI-SfaNI fragment of PIR was isolated.

100 pmoles of the annealed oligonucleotides (A)
and (B) and approximately 4 pmoles of the isolated
PIR EcoRI-SfaNI fragment were ligated for 14 hours at
15°C in 30 µl of 66 mM Tris-HCl (pH 7.6), 6.6 mM
MgCl2, 10 mM dithiothreitol and 1 mM ATP using 20
units of T4 DNA ligase. As a result of
dephosphorylation of the proinsulin gene fragment
prior to SfaNI cleavage, and phosphorylation of only
oligonucleotide (A), as described above, the reaction
was unidirectional, resulting in ligation of the
proinsulin gene fragment PIR to the annealed
oligonucleotide. The ligation reaction product was
electrophoresed on 1.5% agarose gel and the resultant
proinsulin gene analog construct (PI Analog) was
isolated from the gel.

The isolated proinsulin gene analog sequence
(approximately 3 pmoles) was phosphorylated with ATP
and T4 polynucleotide kinase. 10 µg of plasmid
pAT/PIR, prepared as described in (a) above, was
digested for 1 hour at 37°C with 40 units each of
EcoRI and BamHI. Approximately 3 pmoles of the PIR
proinsulin gene sequence were isolated from agarose
gel. The isolated proinsulin gene sequence and
phosphorylated proinsulin gene analog sequence were
ligated at 37°C for 3 hours in 30 µl of ligation
buffer containing 20 units of T4 ligase, 40 units of
EcoRI and BamHI.

The ligation mixture was run on 1.0% agarose gel 
and, after autoradiography, the dimer, i.e. two 
proinsulin coding sequences, and trimer, i.e. three 
proinsulin coding sequences, were isolated from the 
gel by phenol extraction. The tetramer, pentamer and 
hexamer, i.e. four, five and six proinsulin coding 
sequences respectively, were obtained by further 
ligating the isolated fragments with additional 
proinsulin gene analogs following the general 
procedure described above. 

(c) Preparation of pTac and plac 239 Recombinants
containing Multiple PI Units

The various PI fragment constructs, containing
from two proinsulin coding sequences up to six
proinsulin coding sequences, were inserted into the
vectors pTac and plac 239 by ligation, as generally
described in (b) above. The pTac and plac 239 vectors
were formed by digestion with EcoRI and BamHI of
pTac/PI and plac 239/PI respectively, which, in turn,
were formed following the procedures which are
outlined below in sections (d) and (e). By inserting
a proinsulin gene trimer into the plac 239 vector and
a proinsulin gene pentamer into the pTac vector, there
were obtained the fused and unfused multiple
proinsulin gene plasmids plac 239/3PI and pTac/5PI
respectively. The derivation and main restriction
site map for the latter new plasmid appears as Figure
3.

In similar manner, other multiple proinsulin gene
plasmids, including plac 239/4PI, plac 239/5PI,
pTac/3PI, pTac/4PI, and pTac/6PI were constructed from
the various multimer proinsulin genes and the plac 239
and pTac vectors. The various constructs, containing
from 1 to 6 copies of the proinsulin coding sequence,
were confirmed by restriction mapping and DNA sequence
analysis.

(d) Preparation of plac 239-PI 

Plasmid plac 239-PI, whose derivation is
illustrated in Figure 6, was prepared from plasmid
plac 504/PI, as shown in detail in Figure 8. The
formation of plasmid plac 504/PI is described in more
detail below in section (f). The plasmid plac 504/PI
has a lac promoter and the β-galactosidase gene (z-gene)
complete to the EcoRI site near the carboxy terminal
of the gene, with the human proinsulin gene insert of
plasmid pBCA4 fused at this EcoRI site so as to
remain in reading frame, such that the translation of
the z-gene continues directly to code for proinsulin.

Referring to Figure 8, plasmid plac 504/PI was
digested with MstII and the resulting sticky ends were
filled in with dTTP and dGTP. The 5'-T's were then
removed by digestion with mung bean nuclease, the
fragment cut with EcoRI to remove the MstII/EcoRI
segment of the z-gene, the sticky ends of EcoRI were
filled in with dATP and dTTP, and the plasmid was
blunt-end ligated to establish a new EcoRI site, the
resulting plasmid being designated as plac 239/PI.
The plasmid was cut with EcoRI at the new site, the
fragment EcoRI tails were digested with mung bean
nuclease, and the resulting fragment blunt-end ligated
to form the plasmid plac 239-PI in which the
correct PI reading frame was restored. In order to
restore the correct PI reading frame in other fused
plasmids, for example, plac 239/3PI, the EcoRI tails
should also be removed by the same procedure described
above, to form the corresponding expression plasmid
plac 239-3PI. The derivation and main restriction
site map for plac 239-3PI appears as Figure 2.
(e) Preparation of pTac/PI 

The preparation of plasmid pTac/PI is illustrated
in Figure 6 and was effected from plac 504/PI and from
plasmid pDR540. The plasmid pDR540 is described by
Russell et al in Gene, Vol. 20, 231 (1982) and has the
following features: a HindIII site upstream from the
promoter, a -35 region from trp, the -10 operator,
Shine-Dalgarno, and Shine-Dalgarno-to-start spacing
regions of lac, with a BamHI site positioned at the
start. The plasmid pDR540, purchased from P-L
Biochemicals, Milwaukee, Wisconsin, U.S.A., was cut
with BamHI and HindIII, the sticky ends removed with
mung bean nuclease and the isolated Tac promoter was
inserted by blunt-end ligation into the isolated
HindIII-EcoRI fragment of plasmid plac 504/PI which
contains the PI gene, by first digesting plasmid
plac 504/PI with HindIII and EcoRI and then filling in
the ends with DNA polymerase I. In this way, the Tac
promoter of the plasmid pDR540 is substituted for the
wild-type lac promoter and the z-gene portion of plac
504/PI, thereby preserving the sequence, spacing and
properties of the hybrid promoter, but with an EcoRI
sequence appropriately located.
(f) Preparation of plac 504/PI 

Plasmid plac 504/PI was prepared following the
sequence of steps set forth in Figure 6 and the
derivation and main restriction sites for the plasmid
are illustrated in Figure 7.

The initial step in the formation of plasmid plac
504/PI was the formation of plasmid plac 19 from known
plasmids pMC1396 and M13mp7. The plasmid pMC1396 is
described by Casadaban et al in J. Bact., Vol. 143,
971 (1980) while the plasmid M13mp7 is described by
Messing et al in Nucleic Acids Research, Vol. 9, 309
(1981). Plasmid M13mp7 was purchased from BRL Inc.,
Gaithersburg, MD, U.S.A. and plasmid pMC1396 was a
gift of M. Casadaban of Cambridge, MA, U.S.A.

Plasmid pMC1396 was digested with SmaI, treated
with Bal 31 and ligated with HindIII linkers to form
plasmid plac 9. Plasmid M13mp7 was digested with
AvaII and ligated to a HindIII linker to form plasmid
plac 1. Both plasmid plac 9 and plac 1 were digested
with HindIII and ligated to form the plac 19 plasmid.

Plasmid pBCA4, containing the sequence coding for
human proinsulin, was digested with EcoRI and SalI and
the EcoRI-SalI fragment was isolated. Plasmid plac
19 was digested with EcoRI and SalI and the resulting
vector was ligated with the EcoRI-SalI fragment of
plasmid pBCA4 to form plasmid plac 19/PI.

Plasmid pBGP120 contains the lac promoter and is
described in Polisky et al, Proc. Nat. Acad. Sci. Vol.
73, 3900 (1976). This plasmid, obtained from Dr. R.
Wu of Cornell University, Ithaca, N.Y., U.S.A., was
digested with EcoRI and HindIII and the HindIII-EcoRI
fragment containing the lac promoter was separated.
Following digestion of the plac 19/PI plasmid with
EcoRI and HindIII, the resulting vector was ligated
with the HindIII-EcoRI fragment of plasmid pBGP120 to
obtain the plasmid plac 504/PI.

Example 2 

This Example illustrates the preparation of 
converted microorganisms from the novel plasmids 
containing multiple proinsulin genes and expression of 
the polyproinsulin polypeptide. 

(a) Transformation and Growth of Cells 
The novel plasmids plac 239-nPI, where n is 2 to
5, and pTac/mPI, where m is 3 to 6, were used to
transform the strain of E. coli known as JM103. This
strain is described by Messing et al in Nucleic Acids
Res., 9, 309 (1981) and is characterized by the
genotype Δ(lac pro), thi, strA, supE, endA, sbcB,
hsdR, F' traD36, proAB, lacIq, ZΔM15. The use of this
E. coli strain was for convenience only and other
E. coli strains and indeed other microorganisms could
have been used.

Transformed strains were grown in YT medium
containing 50 µg/ml of ampicillin. YT medium
contains, per litre, 8 g tryptone, 5 g yeast extract,
5 g NaCl, and 1 ml 1 N NaOH (to pH 7.0 to 7.2). In
order to induce the expression of the polyproinsulin
polypeptide from the plasmid,
isopropyl-β-D-thiogalactoside (IPTG) was added to
provide a final concentration of 1 mM when the cell
density had reached an optical density of 0.1,
measured at a wavelength of 560 nm.
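
The medium recipe and inducer concentration scale linearly with culture volume, which the following sketch works through. It is an illustration under assumptions: the function name is hypothetical, the IPTG molar mass of 238.3 g/mol is a value supplied here rather than taken from the text, and the NaOH is left out because it is added to reach the stated pH range rather than by fixed mass.

```python
YT_PER_LITRE_G = {"tryptone": 8.0, "yeast extract": 5.0, "NaCl": 5.0}
AMPICILLIN_UG_PER_ML = 50.0
IPTG_FINAL_MM = 1.0
IPTG_G_PER_MOL = 238.3  # assumed molar mass of IPTG, not stated in the text


def yt_batch(volume_ml: float) -> dict[str, float]:
    """Return the mass in grams of each component for volume_ml of culture."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    litres = volume_ml / 1000.0
    batch = {name: grams * litres for name, grams in YT_PER_LITRE_G.items()}
    batch["ampicillin"] = AMPICILLIN_UG_PER_ML * volume_ml / 1e6   # ug -> g
    batch["IPTG"] = IPTG_FINAL_MM / 1000.0 * litres * IPTG_G_PER_MOL
    return batch


for component, grams in yt_batch(500.0).items():
    print(f"{component}: {grams:.4f} g")
```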

(b) Recovery of Expressed Polypeptide 
After induction by IPTG, cell growth was
continued for about 8 to 12 hours. The cells were
harvested by centrifugation, washed with
phosphate-buffered saline, and either analyzed
directly by SDS-PAGE (sodium dodecyl
sulphate-polyacrylamide gel electrophoresis), as
described by Laemmli in Nature, 227, 680 (1970), or
further purified. Such further purification was
effected by suspending washed cells from 3 ml of
culture in 1.0 ml of T buffer (i.e. 50 mM Tris-HCl at
pH 7.9 containing 25% sucrose, 1% Nonidet-P40, 0.5%
sodium deoxycholate and 5 mM EDTA) and subjecting the
cells to sonic disruption by two 60 second bursts at a
wavelength of 12 µm at 0°C. The sonicated material
was centrifuged for 10 minutes in an Eppendorf
microcentrifuge at 4°C. The centrifuged pellets were
dissolved in the sample buffer and analyzed by
SDS-PAGE.
(c) Analysis of Constructs

Electrophoretic analysis of multiply-expressed
proinsulin polypeptide in the unfused configuration
is shown in Figure 9. Proteins, partially purified as
described above, were analyzed by 15% SDS-PAGE. Lanes
1 to 5 represent the products equivalent to 500 µl of
original culture of JM103 cells harbouring the
plasmids pTac/PI, pTac/2PI, pTac/3PI, pTac/4PI and
pTac/5PI, while lane M contains protein molecular
weight markers.

Electrophoretic analysis of multiply-expressed
proinsulin polypeptide in the fused configuration is
shown in Figure 10. Lanes A, B and C represent the
total cell protein equivalent to 150 µl of original
culture of JM103 cells harbouring the plasmids plac
239-PI, plac 239-2PI and plac 239-3PI, while lanes D, E
and F were the same products as lanes A, B and C,
except that the products were partially purified as
described above. Lane M contains protein molecular
weight markers.

The amount of product obtained per cell of 
microorganism was not determined quantitatively but 
was qualitatively observed to be significantly greater 
for the multiple gene fragments when compared with 
that known to be typically obtained when expressing 
hybrid polypeptides in the prior art. 

Example 

This Example illustrates the isolation and 
purification of polypeptide fractions and the 
formation of insulin therefrom. 

(a) Recovery of Proinsulin 

About 7 g of cells were suspended in T buffer 
using 10 ml/g and were sonically disrupted at 0°C 
using two 90 second bursts at a wavelength of 18 µm. 
Following centrifugation at 12,000 g for [illegible], 
the pellet was suspended in 25 mM Tris-HCl, 2 M NaCl, 
4 M urea at pH 8.4, using 5 ml/g of cells, resonicated 
and recentrifuged at 12,000 g. In order to solubilize 
the polyproinsulin, the pellet was resuspended in GDFT 
buffer (7 M guanidinium chloride, 100 mM sodium 
formate, pH 3.0) using 5 ml/g cells and gently stirred 
for 1 hour. The polyproinsulin contained in the GDFT 
buffer was precipitated by dilution into 6 volumes of 
ice-cold water, and collected by low speed 
centrifugation at 5,000 g. The resultant pellet 
contained essentially pure proinsulin polypeptide 
polymer, as determined by SDS-PAGE (see Fig. 11, Lane 
B). 
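
The recovery steps above are specified as volumes per gram of cells, so they can be scaled to any batch size. The following Python sketch tabulates the volumes for the 7 g batch described. The step names are labels chosen here for illustration, and the water volume is estimated from the GDFT buffer volume alone, which is an assumption that ignores the volume contributed by the pellet.

```python
# Ratios taken from the text, in ml of buffer per g of cells.
STEPS = [
    ("T buffer (sonication)", 10.0),
    ("Tris/NaCl/urea wash", 5.0),
    ("GDFT buffer (solubilization)", 5.0),
]
# The GDFT solution is diluted into 6 volumes of ice-cold water.
WATER_VOLUMES = 6.0


def buffer_plan(cell_mass_g):
    """Return a dict mapping each step to its volume in ml."""
    plan = {name: ratio * cell_mass_g for name, ratio in STEPS}
    # Assumption: pellet volume is neglected when sizing the water.
    gdft_ml = plan["GDFT buffer (solubilization)"]
    plan["ice-cold water (precipitation)"] = WATER_VOLUMES * gdft_ml
    return plan


plan = buffer_plan(7.0)
for name, ml in plan.items():
    print(f"{name}: {ml:.0f} ml")
```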

Purified proinsulin polymer was dissolved in 70% 
formic acid using 10 ml/2 g cell equivalents, digested 
with cyanogen bromide at 50 mg per mg of protein for 
35 hours at room temperature, and the product analyzed 
by SDS-PAGE. Lane A in Figure 11 shows the cleaved 
product of the five tandem proinsulin polypeptide. 
Lane M contains protein molecular weight markers. A 
Western blot of the SDS-PAGE gel of Lanes A and B 
showed the human proinsulin-sized material to be 
reactive with porcine anti-proinsulin sera. 
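
Cyanogen bromide is generally understood to cleave a polypeptide chain on the carboxyl side of methionine residues. The following Python sketch shows how such a digest would release single units from a tandem polymer. The unit sequence and the single-methionine join used here are hypothetical placeholders and are not the sequences of this Example.

```python
def cnbr_fragments(protein):
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Hypothetical placeholders: a methionine-free unit joined by single Met
# residues, with an initiating Met at the amino terminus.
UNIT = "GSTAKLVE"
polymer = "M" + "M".join([UNIT] * 3)

fragments = cnbr_fragments(polymer)
print(fragments)
```

In the real reaction the methionine at the carboxyl end of each released fragment is converted to homoserine (or its lactone); the sketch keeps the letter M for simplicity.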

(b) Production of Insulin 

Recovered cyanogen bromide digest product was 
sulfitolyzed by dissolving the material in GE buffer, 
containing 6 M guanidinium HCl and 0.2 M ethanolamine 
(pH 9), at 10 mg/ml, adding Na2SO3 and Na2S4O6 to 
concentrations of 0.4 M and 0.8 M respectively, and 
allowing the reaction to occur for 2 hours at room 
temperature. The sulfitolyzed product was 
precipitated by dialysis against 10 mM ammonium 
acetate (pH 4.5). The precipitate was dissolved at 4 
mg/ml in TF buffer, containing 25% formamide and 50 mM 
Tris-HCl (pH 8.1) and the solution loaded onto a DEAE 
cellulose column in the ratio of 200 mg per 2.5 x 10 cm 
column. The products were differentially eluted by 
sodium chloride during application of a 0 to 500 mM 
linear NaCl gradient. 

The sulfitolyzed biosynthetic human proinsulin 
(BHPI-SSO3) was readily identifiable by HPLC analysis 
of the salt gradient elution profile. BHPI-SSO3 was 
collected and dialyzed at pH 4.5 to remove formamide 
and to precipitate the product. The pure BHPI-SSO3 
precipitate was dissolved in RH buffer, containing 50 
mM glycine and 5 mM EDTA (pH 10.5), at 0.5 mg/ml and 
cooled to -2°C. Ethanethiol was added to 800 µM and 
allowed to react for 24 hours to effect refolding of 
the molecule. The reaction was stopped by adjusting 
the pH to 7 with [illegible] and the reaction product was 
analysed. Refolded biosynthetic human proinsulin 
(BHPI) constituted the monomeric component of the 
reaction product. 

The BHPI was dissolved in [illegible] buffer, containing 
100 mM Tris-HCl (pH 7.6) and 10 mM CaCl2, at 2 mg/ml. 
10 µg/ml of trypsin and 40 µg/ml of [illegible] 
were added, and the mixture was allowed to react for 
30 minutes at room temperature. The reaction was 
stopped by adding acetic acid to 1% and biosynthetic 
human insulin (BHI) was recovered by precipitation. 
The recovered yield was 98% of theoretical, based on 
the amount of BHPI used. 
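
A yield "of theoretical" on a mass basis depends on the ratio of the molecular weights of insulin and proinsulin, since one molecule of proinsulin gives at most one molecule of insulin. The following Python sketch shows the arithmetic. The molecular weights are approximate literature values supplied here as assumptions; they are not stated in the text, and the starting mass in the usage line is hypothetical.

```python
# Approximate literature values in daltons (assumptions, not from the text).
MW_INSULIN = 5808.0
MW_PROINSULIN = 9390.0


def theoretical_insulin_mg(proinsulin_mg):
    """Maximum insulin mass obtainable from a given proinsulin mass."""
    return proinsulin_mg * MW_INSULIN / MW_PROINSULIN


def recovered_insulin_mg(proinsulin_mg, fraction_of_theoretical=0.98):
    """Insulin mass recovered at the stated fraction of theoretical."""
    return fraction_of_theoretical * theoretical_insulin_mg(proinsulin_mg)


# Hypothetical starting amount of 10 mg of BHPI.
print(f"theoretical: {theoretical_insulin_mg(10.0):.2f} mg")
print(f"recovered:   {recovered_insulin_mg(10.0):.2f} mg")
```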

The BHI was considered authentic on the basis of 
HPLC analysis, amino acid composition, amino acid 
sequencing and biological activity. Human proinsulin 
in the form of tandemly-linked units, therefore, is 
stably expressed from transformed cells and human 
insulin can be readily formed from the expressed 
product. 

In summary of this disclosure, the present 
invention provides a novel procedure for expressing 
polypeptides from organisms, which involves providing 
a DNA fragment containing a multiple number of 
oligonucleotide sequences each coding for a desired 
polypeptide product and tandemly-linked by 
oligonucleotides coding for easily-cleavable amino 
acid groupings, forming the fragment into a 
recombinant plasmid and transforming an organism by 
the plasmid. Modifications are possible within the 
scope of this invention. 
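
The fragment summarized above can be pictured as repeated coding units joined by linker codons, read through in a single frame from end to end. The following Python sketch assembles such a fragment and checks that no in-frame stop codon interrupts it. The unit and linker sequences are hypothetical placeholders, and the check covers stop codons only; it does not attempt to model start signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_tandem(unit, linker, copies):
    """Join `copies` coding units with `linker` between neighbours."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return linker.join([unit] * copies)


def internal_stops(fragment):
    """Return 0-based codon indices of in-frame stop codons."""
    if len(fragment) % 3:
        raise ValueError("fragment length must be a multiple of 3")
    codons = [fragment[i:i + 3] for i in range(0, len(fragment), 3)]
    return [i for i, codon in enumerate(codons) if codon in STOP_CODONS]


# Hypothetical placeholders: a 3-codon unit (Ala-Lys-Gly) and a
# single-codon linker (ATG, methionine).
UNIT = "GCTAAAGGT"
LINKER = "ATG"

fragment = build_tandem(UNIT, LINKER, 3)
print(fragment, internal_stops(fragment))
```

Note that the placeholder unit contains the letters TAA out of frame (GC-TAA-AGGT); only codons in the reading frame are counted, which is why the check works codon by codon rather than by substring search.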
CLAIMS 

1. A DNA fragment having an oligonucleotide sequence 
coding for a polypeptide composed of repeating units 
of a desired product of known amino acid sequence 
tandemly-linked by an oligonucleotide coding for an 
easily-cleavable amino acid sequence and having no 
codons for stop and start signals between the ends of 
said fragment. 

2. The fragment claimed in claim 1 in a replicable 
cloning vehicle capable of expressing the fragment in 
a microbial organism. 

3. The fragment claimed in claim 1 or 2, 
characterized in that the desired product is human 
proinsulin. 

4. The fragment claimed in any one of claims 1 to 3, 
characterized by from 2 to 6 repeating units each 
coding for human proinsulin and being tandemly-linked 
through oligonucleotides coding for easily-cleavable 
amino acid sequences. 

5. A recombinant plasmid, comprising a plasmid 
vector and an expressible DNA fragment ligated 
thereto, characterized in that the DNA fragment is 
composed of repeating units coding for a desired 
product of known amino acid sequence tandemly-linked 
by oligonucleotides coding for an easily-cleavable 
amino acid sequence. 

6. The plasmid claimed in claim 5, characterized in 
that the desired product is human proinsulin, the DNA 
fragment is joined to the 3'-side of a lac operon 
fragment encoding the first approximately 80 amino 
acids of β-galactosidase, and the fragment contains 
from 2 to 5 tandemly-linked oligonucleotide sequences 
coding for human proinsulin. 

7. The plasmid claimed in claim 5, characterized in 
that the desired product is human proinsulin, the DNA 
fragment is directly linked to a fragment of the 
cloning vector containing the Tac promoter followed by 
a bacterial Shine-Dalgarno sequence, and the fragment 
contains from 3 to 6 tandemly-linked oligonucleotide 
sequences coding for human proinsulin. 

8. A microorganism which has been transformed by the 
plasmid claimed in any one of claims 5 to 7 for 
expression of said DNA fragment therefrom. 

9. A method of producing polypeptides by 
microorganisms, characterized by: 

(a) cleaving a vector at a predetermined 
restriction site; 

(b) preparing an insert to the cleaved vector 
which has multiple copies of the gene for the 
polypeptide separated by oligonucleotide sequences 
coding for easily-cleavable amino acid sequences; 

(c) ligating the insert to the cleaved vector to 
form a recombinant plasmid containing the insert; 

(d) transforming a microorganism with the 
recombinant plasmid; 

(e) growing said microorganism in a culture medium to 
effect expression of a DNA fragment containing said 
multiple copies of the coding sequence for the 
polypeptide; 

(f) isolating from the microorganism a fraction 
constituting a polymer composed of the multiple linked 
copies of the polypeptide; and 

(g) cleaving the isolated multiple copy 
polypeptide to single polypeptides. 

10. The method claimed in claim 9, characterized in 
that the polypeptide is human proinsulin. 

11. The method claimed in claim 10, characterized in 
that the human proinsulin is converted into human 
insulin. 

12. The method claimed in any one of claims 9 to 11, 
characterized in that step (f) includes sonic 
disruption of cells of the microorganism separated 
from the culture medium. 
FIG. 3 

FIG. 5 